AITopics

2605.17678

Country: Europe (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Neural Information Processing SystemsFeb-12-2026, 01:02:42 GMT

fc95fa5740ba01a870cfa52f671fe1e4-Paper.pdf

high probability, probability, sequence, (15 more...)

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Neural Information Processing SystemsDec-25-2025, 08:05:57 GMT

Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize

This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear system $\bar{A}\theta = \bar{b}$ for which $\bar{A}$ and $\bar{b}$ can only be accessed through random estimates $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$. Our analysis is based on new results regarding moments and high probability bounds for products of matrices which are shown to be tight. We derive high probability bounds on the performance of LSA under weaker conditions on the sequence $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$ than previous works. However, in contrast, we establish polynomial concentration bounds with order depending on the stepsize. We show that our conclusions cannot be improved without additional assumptions on the sequence of random matrices $\{{\bf A}_n: n \in \mathbb{N}^*\}$, and in particular that no Gaussian or exponential high probability bounds can hold. Finally, we pay a particular attention to establishing bounds with sharp order with respect to the number of iterations and the stepsize and whose leading terms contain the covariance matrices appearing in the central limit theorems.

linear stochastic approximation, name change, tight high probability bound, (5 more...)

Genre: Research Report > New Finding (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.76)

Khusainov, Marat, Sheshukova, Marina, Durmus, Alain, Samsonov, Sergey

On the Rate of Gaussian Approximation for Linear Regression Problems

arXiv.org Machine LearningSep-18-2025

In this paper, we consider the problem of Gaussian approximation for the online linear regression task. We derive the corresponding rates for the setting of a constant learning rate and study the explicit dependence of the convergence rate upon the problem dimension $d$ and quantities related to the design matrix. When the number of iterations $n$ is known in advance, our results yield the rate of normal approximation of order $\sqrt{\log{n}/n}$, provided that the sample size $n$ is large enough.

approximation, recurrence, stochastic approximation, (13 more...)

2509.14039

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.61)

Neural Information Processing SystemsAug-19-2025, 01:18:20 GMT

fc95fa5740ba01a870cfa52f671fe1e4-Supplemental.pdf

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Neural Information Processing SystemsAug-19-2025, 01:18:16 GMT

Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize

However, in most cases, there is a consistent gap between these two types of analyses.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(4 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Butyrin, Bogdan, Rubtsov, Artemy, Naumov, Alexey, Ulyanov, Vladimir, Samsonov, Sergey

Gaussian Approximation for Two-Timescale Linear Stochastic Approximation

arXiv.org Machine LearningAug-12-2025

In this paper, we establish non-asymptotic bounds for accuracy of normal approximation for linear two-timescale stochastic approximation (TTSA) algorithms driven by martingale difference or Markov noise. Focusing on both the last iterate and Polyak-Ruppert averaging regimes, we derive bounds for normal approximation in terms of the convex distance between probability distributions. Our analysis reveals a non-trivial interaction between the fast and slow timescales: the normal approximation rate for the last iterate improves as the timescale separation increases, while it decreases in the Polyak-Ruppert averaged setting. We also provide the high-order moment bounds for the error of linear TTSA algorithm, which may be of independent interest.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2508.07928

Country:

Asia > Middle East > Jordan (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing SystemsJan-19-2025, 15:00:33 GMT

Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize

This paper provides a non-asymptotic analysis of linear stochastic approximation (LSA) algorithms with fixed stepsize. Our analysis is based on new results regarding moments and high probability bounds for products of matrices which are shown to be tight. We derive high probability bounds on the performance of LSA under weaker conditions on the sequence \{({\bf A}_n, {\bf b}_n): n \in \mathbb{N} *\} than previous works. However, in contrast, we establish polynomial concentration bounds with order depending on the stepsize. We show that our conclusions cannot be improved without additional assumptions on the sequence of random matrices \{{\bf A}_n: n \in \mathbb{N} *\}, and in particular that no Gaussian or exponential high probability bounds can hold.

fixed stepsize, linear stochastic approximation, tight high probability bound, (3 more...)

Genre: Research Report > New Finding (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Samsonov, Sergey, Moulines, Eric, Shao, Qi-Man, Zhang, Zhuo-Song, Naumov, Alexey

Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning

arXiv.org Machine LearningMay-26-2024

In this paper, we obtain the Berry-Esseen bound for multivariate normal approximation for the Polyak-Ruppert averaged iterates of the linear stochastic approximation (LSA) algorithm with decreasing step size. Our findings reveal that the fastest rate of normal approximation is achieved when setting the most aggressive step size $\alpha_{k} \asymp k^{-1/2}$. Moreover, we prove the non-asymptotic validity of the confidence intervals for parameter estimation with LSA based on multiplier bootstrap. This procedure updates the LSA estimate together with a set of randomly perturbed LSA estimates upon the arrival of subsequent observations. We illustrate our findings in the setting of temporal difference learning with linear function approximation.

approximation, exp, inequality, (16 more...)

2405.16644

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial IntelligenceAug-8-2021

Online Bootstrap Inference For Policy Evaluation in Reinforcement Learning

Ramprasad, Pratik, Li, Yuantong, Yang, Zhuoran, Wang, Zhaoran, Sun, Will Wei, Cheng, Guang

The recent emergence of reinforcement learning has created a demand for robust statistical inference methods for the parameter estimates computed using these algorithms. Existing methods for statistical inference in online learning are restricted to settings involving independently sampled observations, while existing statistical inference methods in reinforcement learning (RL) are limited to the batch setting. The online bootstrap is a flexible and efficient approach for statistical inference in linear stochastic approximation algorithms, but its efficacy in settings involving Markov noise, such as RL, has yet to be explored. In this paper, we study the use of the online bootstrap method for statistical inference in RL. In particular, we focus on the temporal difference (TD) learning and Gradient TD (GTD) learning algorithms, which are themselves special instances of linear stochastic approximation under Markov noise. The method is shown to be distributionally consistent for statistical inference in policy evaluation, and numerical experiments are included to demonstrate the effectiveness of this algorithm at statistical inference tasks across a range of real RL environments.

algorithm, approximation, bootstrap method, (16 more...)

arXiv.org Artificial Intelligence

2108.03706

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Transportation > Ground > Road (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)